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Abstract 

We reconsider the variational integration of optimal control problems for 
mechanical systems based on a direct discretization of the Lagrange- d'Alemhert 
principle as proposed in [4] . This approach yields discrete dynamical constraints 
which by construction preserve important structural properties of the system, 
like the evolution of the momentum maps or the energy behaviour. 

Here, we employ higher order quadrature rules based on polynomial colloca- 
tion. The resulting variational time discretization decreases the overall compu- 
tational effort. 

1 Introduction 

In recent years, much effort in designing numerical methods for the time integration 
of (ordinary) differential equations has been put into schemes which are structure pre- 
serving in the sense that important qualitative features of the original dynamics are 
preserved in its time discretization, c/. the recent monograph [2]. A particularly ele- 
gant way to, e.g., derive symplectic integrators is by discretizing Hamilton's principle 
as suggested by [6, 7], see also [3]. 

Evidently, structure preservation might equally be important in optimal control 
problems. In fact, in [4] a new approach^ to the transcription of a mechanical optimal 
control problem into a finite dimensional nonlinear programming problem has been 
proposed which is based on a direct discretization of the Lagrange- d'Alemhert princi- 
ple (instead of the associated Euler-Lagrange differential equations of motion). This 
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approach yields a finite-difference type discretization of the dynamical constraints of 
the problem which by construction preserves important structural properties of the 
system, like the evolution of the momentum maps associated to the symmetries of the 
Lagrangian or the energy behaviour [3, 4]. 

So far, quadrature rules of second order have been used in order to approximate 
the action functional of the system. In this work, we employ higher order rules based 
on polynomial collocation as suggested in [3] for variational integrators. This decreases 
the overall computational effort of the approach, while maintaining its structure preser- 
vation properties. 

The paper is structured as follows: In Section 2 we introduce the mechanical opti- 
mal control problem under consideration, while in Section 3 we describe its variational 
discretization using higher order polynomial collocation. We present two numerical 
experiments in Section 4 and conclude by outlining future research directions in Sec- 
tion 5. 

2 Mechanical optimal control problems 

We consider a mechanical system with configuration manifold Q together with a 
Lagrangian L : TQ M., k > 2, where the associated state space TQ describes the 
position and velocity of a particle moving in the system. Usually, the Lagrangian takes 
the form of kinetic minus potential energy, L{q, q) = K{q, q) — V{q) = \ q^ ■ M{q) ■ q — 
V{q), for some (positive definite) mass matrix M{q). 

Given a time interval [0,T], the Lagrangian defines an integral action & : C^([0,T], 
Q) — M on the space of twice differentiable paths q: [0,T] — )■ Q, namely: 



Note that C^([0, T], Q) is a infinite dimensional smooth manifold and that C5 is of class 
C^. To be rigorous, C^([0, T],Q) should be completed with some norm to form a Banach 
space, although this falls out of the scope of the present work. 

The prinpicle of least action, also known as Hamilton's principle, establishes that 
a "particle" moving in Q from a point g*^ G Q to a point q^ E Q will do so along a 
path q: [0, T] — )■ Q such that g(0) = g° and q(T) = q^ and q itself is an extremal of (S. 
An extremal of the integral action is a trajectory q e C^([0,T],(5) so that, for any 
infinitesimal variation 5q G TC^([0, T], Q) = C^{[Q,T],TQ) of q {i.e. TQo6q = q, where 
tq: TQ —> Q is the canonical projection), we have that 




Se{q)-6q = 0. 
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Using integration by parts and considering infinital variations that are null at the end 
points, Sq{0) = 6q(T) = 0, it is easy to show that the previous expression is equivalent 
to the following one 

dL _ d_dL _ ^ 

dq dt dq ' 
which is the well known Euler- Lagrange equation. 

The trajectories resulting from the previous second order differential equation are 
free in the sense that no force exerts any influence on the mechanical system, which 
is a rather simplistic assumption. Otherwise and more generally, if an external force 
/: TQ T*Q exerts some influence on the system, the principle of least action is 
then know as the principle of virtual work or Lagrange- d'Alembert principle, which 
states that in presence of an external force the extremal trajectories must satisfy the 
variational equation 

5 r L{q{t), dt + r q{t)) " Sqit) dt = , (1) 



for any infinitesimal variation Sq that is null at the end points. This equation is in 
turn equivalent to the forced Euler-Lagrange equation 

dL _ d dL ^ ^ _^ 
dq dt dq 

When the Lagrangian is regular, that is when the velocity Hessian matrix d'^L/dq^ is 
non-degenerate, the Euler-Lagrange equation (2) has a unique solution q G C^([0, T], Q) 
for each initial condition g°) G TQ (rather than for boundary conditions q^, q^ G 
Q). Therefore, it defines a map Fl: TQ x R ^ TQ hj FL{q°,q°,t) := {q{t),q{t)). In 
fact, Fl is a C'^'~^-diffeomorphism (for fixed t G M) called the Lagrangian flow of the 
system. 

Imagine now that we want to control the trajectories of the system following a 
minimal cost criteria. We must then first assume that the mechanical system may 
be driven by means of some control parameter u E U G R^ on which the external 
force / will therefore depend. We also assume that an infinitesimal cost function 
C : TQ Xq T*Q ^ M is given associated to the objective functional Z : C^([0, T],Q) x 
C([0,T],f/) ^ M 

Ziq, u) = r C{q{t), m, fiq{t), g(t), w(t))) dt . (3) 







We seek for pairs of curves (g, u): [0,T] ^ Q x U such that q, under the influence 
of the external force f{q{-), q{-), u{-)), moves from a given initial state (g°, g°) G TQ to 
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a given final state (g-^, q^) G TQ while minimizing the objective functional That is, 
we seek to solve the mechanical optimal control problem 




(4b) 



(4a) 



(4c) 



As stated above, if L is regular, then (4b) defines a unique solution q G C^([0,T], 
Q) (for fixed u G C([0, T], f/)) satisfying the initial condition (4c). Note that the 
assumption on u to be continuous is rather restrictive from the point of view of typical 
applications. Since the focus of this paper is on the approximation of the solution 
curve q by (higher order) polynomials we nevertheless restrict our attention to this 
case here. In the following we assume the regularity of L as well as the existence of 
an optimal solution {q*,u*) to the mechanical optimal control problem and focus on 
the integration of the dynamical part in order to solve the optimal control problem 
numerically. 

3 Higher order variational discretization 

Solution methods for optimal control problems can be divided into indirect and direct 
approaches (cf. [1]). While the indirect approach bases on the solution of the necessary 
optimality conditions, the direct approach transforms the problem into a finite dimen- 
sional restricted optimization problem by a discretization of the forced Euler-Lagrange 
equation (2). In this work, we follow a direct approach, however instead of dscretizing 
the equation of motion, we discretize the variational principle (1) (cf. [4]). 

In order to do so, we use piecewise polynomial approximations to the trajectories 
and numerical quadrature to approximate the integrals following [3]. To this end, we 
divide the interval [0,T] into smaller subintervals Ik = [tk,tk+i] (to = 0, t]\r = T) 
of fix length h = T/N and on each of these subintervals we perform the following 
construction: We approximate q: ^ Q and u: Ik ^ U hj polynomials qf : [0, h] ^ Q 
and uf. : [0, /i] — )■ f/ of degree s and m, respectively. Given intermediate times = do < 
di < ■■■ < ds-i < ds = I and intermediate points qk = ql, ql-, ■ ■ ■ ■, ql'^^lk = 1k+ii 
the interpolating polynomial of degree s with qf{d^h) = q'j^ for z/ = 0, . . . , s is 
uniquely defined. Analogously, we choose a set of interior times = do < di < ■ ■ ■ < 
dm-i < dm = I and interior points . . . G U that represent a parametrization 
of the space of polynomials uf : [0, /i] — )■ f/ of degree m. Obviously, to ensure the 
proper definition of the polynomials implies the assumption of a linear structure on Q, 
canonical or taken, for instance, from a global chart. 
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Note that by identifying the polynomials qf. with the set of intermediate points 
. . . , (resp. with . . . , u^), we are imphcitely identifying the set of poly- 
nomials of order s from [0, h] to Q with Q^^^ (resp. with ^™'+^) and, therefore, its 
tangent space with (Bs-\-iTQ (resp. with ®m+iTU). 

As for the derivation of the continuous time dynamical equations, the control pa- 
rameter plays no role in the derivation of the discrete time dynamical equations. In 
order to simplify notation we therefore ommit the explicit dependence on u and also 
write f{cih) in order to denote f{q{cih),q{cih),u{cih)) in the following. 



Higher order discrete mechanics We approximate the action integral on [0, h] by 
numerical quadrature. 



Ld{ql,ql...,ql) := hJ2biL{qt{c^h),qt{cih)) ^ / L{q,q)dt 

i=l -^^k 



(5) 



where q G [0,1], i = l,...,r, are the quadrature nodes and bi the corresponding 
weights. Lrf is called a multipoint discrete Lagrangian. The discrete action sum over 
the entire trajectory on [0,T] is then 



N-l „T 



Similarly, we approximate the force integral in Equation (1) on J^. Using the short- 
hand notation f{t) = f {qtif) Ati'^)) define the multipoint discrete forces by 

fl = rM, ...,qt):=hj2 bjm^^ . (6) 



i=l 



Then, the force integral is approximated by 



s r „ 

Fd{ql ...,ql)- 6{ql . . . , ql) := fk ' K = hj^ ^^fi^^h) ■ 6qt{c.h) ^ f-Sqdt, 

u=Q i=l "^^fc 

where Sq'^it) denotes variations of given as 

^qtit) = E ^f^hk- Thus, the discrete 
force integral over the entire trajectory on [0, T] is 

U{{ql---,qi)}k=o) = Y.PM,---,ql)^ / f-^q^t- 

k=o -^0 

Having defined the discrete Lagrangian action (3d and the discrete action of the 
force ^d, we require that the discrete version of the Lagrange-d'Alembert principle 
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holds for variations of q^. That is, a sequence of points {(g^, . . . , ql)}k=o is an extremal 
trajectory for the system if, for any variation {S{q1, . . . , ql)}^^Q with 6ql = ^g^f-i = O5 
we have that 

{6&, + d,)-{6{ql...,ql)}^-,' = 0. (7) 

A simple computation shows that the extended set of discrete Euler-Lagrange equations 
can be derived as 

Ds+iLd{ql ...,ql) + DiLd{ql+„ . . . , g^+i) + + fj^+^ = , (8a) 
D.+Mql . . . + = 0, u = l,...,s-l. (8b) 

for k = 0, . . . , N — 1, where 

9Ld ^ fdL dqiicjh) dL dqt{c,h) \ 

Alternative construction The previous construction is a direct derivation of the 
discrete Euler-Lagrange equations, which can also be derived in a two-step construction. 
Since this alternative construction is a reinterpretation of the previous one, we should 
not get into much detail. 

In a first step, we approximate the continuous integral action and force integral by 
the already used quadrature for (cj, Wi), giving rise to (still) continuous action sum and 
force &' and ^' . Then we define a discrete Lagrangian Ld : Q x Q ^ that, for two 
points go, qi ^ Q "sufficiently close" and a time step h > 0, gives the value 

L,(go,gi) := ^'iq'), 

where q'^ satisfies the Lagrange-d'Alembert principle for & and ^' restricted to the class 
of polynomials of degree s joining go and gi in time h. With the proper interpretation, 
one may check that the polynomial g'^ is in fact characterized by the equations (8b). 
Note that, even if it is not explicitely stated, q'^ and therefore Ld depend on the external 
force /. 

In the second step, we define a discrete action sum on Q^^^ by 

N-1 

^diqo, • • • , ^Tv) = X] ^diqk, Qk+i) 

and discrete forward and backward forces 
Uiqo, qi) ■ S{qo, qi) := ^\q') ■ %5q' and /."(go, gi) ■ 5(go, gi) := ^\q') ■ %5q' , 
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where q'^ is given as before. Applying now the discrete Lagrange-d'Alembert principle 
to discrete action sum 0^ and force 

N-l 

dd{qo, - ■ ■ ,qN) ■ S{qo,. . . ,qN) ■= [fd ((Ik, Qk+i) + fd {^k, qk+i)] ■6{qk,qk+i), 

for any variation 6{qo, . . . , g^v) such that 6qo = 6qj\f = 0, we obtain 

A^d(gfc-i, Qk) + DiLdiqk, qk+i) + //(^fc-i, Qk) + fdi^k, Qk+i) = , 

for any k = 1, . . . , N — 1, which is equivalent to (8a) (with the obvious identifications). 

The main advantage of this alternative construction is that, as it may be seen from 
a two-point discrete Lagrangian approach, one can make the most of the usual theory. 
For instance, we may define the forward and backward Legendre transformations 

F+(go,gi) := {qi,D2Ldiqo,qi) + Uiqo,qi)) e T*Q , 

iF"(go,gi) := (go, -^i^d(go,gi) - /d"(go,gi)) eT*Q, 

which are useful to implement the initial and final condition of the optimal control 
problem from a momentum description or to give the momenta associated to the macro 
nodes qo, . . . ,q]\f along the trajectory. 

Discrete approximation of the objective functional The objective functional Z 
is approximated by a discrete objective function Zd using the same numerical quadra- 
ture rule and the same polynomials and uf as above. For each time interval Ik, we 
define the multipoint discrete cost function 

Cd{ql ...,ql,ul...,u^):=hJ2 hC{qt{c.h\ qt{c.h) , ut{c.h)) ^ j C{q, g, u) dt, 

which defines the discrete objective function over the entire trajectory [0, T] 

N-l T 

UM, ...,qlul,..., Olfjo') := E ■■■,qlul...,u^)^ C(g, q, u) dt. 

k=0 "^0 

The discrete version of the optimal control problem is then to minimize ^d subject 
to the discrete equations (8) and discretized boundary constraints, namely 

q,u 

s.t. Ds+iLd{ql ...,ql) + D,Ld{ql^,, . . . , q^,) + ft + fl^, = (9b) 
D,+iL,(gO,...,g^) + /,- = 0, = 1, . . . , s - 1, A; = 0, . . . , AT - 1 (9c) 

(%'(0),%'(0)) = (g°,g°), {q%-,{T)A%-i{T)) = (g^,g^) (9d) 

This restricted optimization problem can be solved by standard optimization techniques 
such as e.g. SQP methods. 
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n 


3 


4 


5 


6 




Xj 








Xj 




Xj 









4/3 




5/6 





32/45 


±^^(7-2v^) 


^(14 + v/7) 




±1 


1/3 


±1 


1/6 


±1 


49/90 
1/10 


±y/^(7 + 2v^) 
±1 


^(14 -v^) 
1/15 



Table 1: Lobatto's quadrature: points and weights 



4 Results 



As proposed in [3] for uncontrolled systems, we use Lagrange polynomials for the 
construction of and uj.. For both of them, the collocation points will coincide with 
the quadrature points of the corresponding Lobatto's quadrature, which will be the 
same quadrature rule to approximate all the integrals: Action, force and objective 
function. According to this, quadrature points and weights are given by Table 1 after 
rescaling from [—1,1] to [0,1], i.e. dy = = c^^i = {x^+i + l)/2 and bi = Wi/2. 
Polynomials qf (and similarly uf) will then take the form 



where are the base of Lagrangian polynomials of degree s 

/.=0,...,s 

We then have that 



P.{t/h) and qi{t) = \Y,P,{t/h).ql. 



h 

We point out here a slight difference with respect to [3]: In there the authors propose 
numerical quadratures for the uncontrolled system with r = s, obtaining a variational 
integrator of order 2s — 2; while the numerical quadrature we use has the same number 
of nodes as the polynomials, r = s + 1, obtaining a variational integrator of order 2s 
(cf. [5]). The numerical scheme for the optimal control problem including the states 
as well as the adjoint variables coming from the necessary optimality conditions then 
inherits this property and shows from numerical simulations (see below) a convergence 
order of 2s, as one might expect. 

The numerical scheme is implemented in Matlab (R2011b) using the built-in func- 
tion fmincon to solve the non-linear programming problem. We ran several numeric 
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jj (nodes) 


2^ = 4 


2^ = 8 


2^ = 16 


2^ = 32 


2^ = 64 


err(g) 


1.92 • 10"^ 


2.48 ■ 10"^ 


1.65 ■ 10"^ 


1.07 • 10-5 


5.70 ■ 10-^ 


err('u) 


6.24 • 10-1 


1.06 ■ 10-2 


7.22 ■ 10-^ 


4.53 • 10-5 


8.67-10-6 


err(A) 


7.71 ■ 10-1 


2.11 ■ 10-2 


1.44- 10-3 


9.02 ■ 10-5 


5.73 ■ 10-6 


tl(iter.) 


12 


18 


21 


30 


26 


tl(constr.) 


247 


665 


1.474 


3.931 


6.735 




10.386 


51.894 


221.148 


1.155.810 


3.919.962 



Table 2: Controlled harmonic oscillator (with s = 2) 



simulations of the controlled harmonic oscillator with Lagrangian L{q, q) = ^{q^ — ^q"^), 
force f{q, q,u) = u and cost function C{q, q, u) = v^. In these simulations, the system 
is driven from an initial steady state (0, 0) to a final steady state (1, 0) in T = 5 seconds 
time. 

Table 2 represent the output obtained from simulations where the trajectories are 
approximated by polynomials of order s = 2. The number of macro nodes ranges 
from 2^ to 26; err(q'), err(u), err(A) represent the error committed on the trajectory, 
controls and adjoint variables, respectively (which are the max. at the coincident macro 
nodes); tl(iter.) is the number of iterations done by fmincon; and ft(constr.) and '^{DL) 
are the number of evaluations of the constraints and the Jacobian of the Lagrangian. 
Since there is no exact explicit solution, for the error comparisons, we consider the 
data obtained for = 2^ as the exact one. As expected, we may observe that the 
error convergence is of order 4. 

Due to the higher order of the numerical scheme, one may easily surpass the machine 
precision and, therefore, it is difficult to obtain better numerical results. For instance, 
this is the case for s = 5, where the order of the method is expected to be 10, which 
surpasses machine precision already for a time-step of length h = 0.01. Besides, the 
particularities of the non-linear solver in use play also an important role (in our case, 
fmincon). 

In Table 3, we compare simulations of lower order with smaller time-step, with 
simulations of higher order with bigger time-step. From the result, we may observe 
that by increasing the order of the method, we may decrease the number of needed 
nodes, while conserving a comparable accuracy. The benefits are clear: Less nodes 
means less variables, lower memory cost, lower cpu time consumption. For instance, 
for s = 5 and A^ = 2^, the number of evaluations of the Lagrangian is two orders of 
magnitudes lower than for s = 2 and N = 2^. 

Note that, since the Jacobian of the constraints is provided by finite differences, 
the number of constraints evaluations scales linearly with the number of nodes. The 
use of automatic differentiation [8] for the computation of derivatives would decrease 
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s = 2 


s = 5 




s = 2 


s = 5 


jj (nodes) 


2^ = 32 


2^ = 4 




(J (nodes) 


2^ = 256 


2^ = 8 


err(gj 


1.07 • 10-5 


1.16 • 10-5 




err(g) 


1.94 ■ 10-'^ 


4.66 ■ 10-'' 


err('u) 


4.53 ■ 10-5 


3.41 ■ 10-5 




err(-u) 


7.31 ■ 10-6 


2.00 ■ 10-5 


err(A) 


9.02 ■ 10-5 


6.45 ■ 10-5 




err(A) 


6.64- 10-^ 


9.51 ■ 10-^ 


tl(iter.) 


30 


27 




tj(iter.) 


37 


31 


(j(constr.) 


3.931 


1.204 




tj(constr.) 


38.043 


2.656 




1.155.810 


187.848 






87.880.098 


796.848 



Table 3: Order comparison: s = 2 vs. s = 5 



the number of evaluations and improve the performance of the numerical scheme. 

5 Conclusions 

In this work, discrete variational mechanics has been applied to numerically solve opti- 
mal control problems for mechanical systems. In extension to [4], the use of higher order 
quadrature rules for the discretization leads to higher order optimal control schemes. 
These schemes are, for a prescribed accuracy of the discrete optimal solution, com- 
putationally more efficient than lower order schemes. For the future, the convergence 
rates of the optimal control scheme, that have been observed numerically, have to be 
proven. Furthermore, based on the discrete variational framework to construct varia- 
tional schemes of arbitrary order, we plan to derive discretization schemes that adapt 
the order depending on the state of the system. 
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